110 research outputs found

    Alternative pre-mRNA processing regulates cell-type specific expression of the IL4I1 and NUP62 genes

    BACKGROUND: Given the complexity of higher organisms, the number of genes encoded by their genomes is surprisingly small. Tissue specific regulation of expression and splicing are major factors enhancing the number of the encoded products. Commonly these mechanisms are intragenic and affect only one gene. RESULTS: Here we provide evidence that the IL4I1 gene is specifically transcribed from the apparent promoter of the upstream NUP62 gene, and that the first two exons of NUP62 are also contained in the novel IL4I1_2 variant. While expression of IL4I1 driven from its previously described promoter is found mostly in B cells, the expression driven by the NUP62 promoter is restricted to cells in testis (Sertoli cells) and in the brain (e.g., Purkinje cells). Since NUP62 is itself ubiquitously expressed, the IL4I1_2 variant likely derives from cell type specific alternative pre-mRNA processing. CONCLUSION: Comparative genomics suggest that the promoter upstream of the NUP62 gene originally belonged to the IL4I1 gene and was later acquired by NUP62 via insertion of a retroposon. Since both genes are apparently essential, the promoter had to serve two genes afterwards. Expression of the IL4I1 gene from the "NUP62" promoter and the tissue specific involvement of the pre-mRNA processing machinery to regulate expression of two unrelated proteins indicate a novel mechanism of gene regulation

    The 3of5 web application for complex and comprehensive pattern matching in protein sequences

    BACKGROUND: The identification of patterns in biological sequences is a key challenge in genome analysis and in proteomics. Frequently such patterns are complex and highly variable, especially in protein sequences. They are frequently described using terms of regular expressions (RegEx) because of the user-friendly terminology. Limitations arise for queries with the increasing complexity of patterns and are accompanied by requirements for enhanced capabilities. This is especially true for patterns containing ambiguous characters and positions and/or length ambiguities. RESULTS: We have implemented the 3of5 web application in order to enable complex pattern matching in protein sequences. 3of5 is named after a special use of its main feature, the novel n-of-m pattern type. This feature allows for an extensive specification of variable patterns where the individual elements may vary in their position, order, and content within a defined stretch of sequence. The number of distinct elements can be constrained by operators, and individual characters may be excluded. The n-of-m pattern type can be combined with common regular expression terms and thus also allows for a comprehensive description of complex patterns. 3of5 increases the fidelity of pattern matching and finds ALL possible solutions in protein sequences in cases of length-ambiguous patterns instead of simply reporting the longest or shortest hits. Grouping and combined search for patterns provide a hierarchical arrangement of larger pattern sets. The algorithm is implemented as an internet application and is freely accessible. The application is available at . CONCLUSION: The 3of5 application offers an extended vocabulary for the definition of search patterns and thus allows the user to comprehensively specify and identify peptide patterns with variable elements. The n-of-m pattern type offers an improved accuracy for pattern matching in combination with the ability to find all solutions, without compromising the user friendliness of regular expression terms.
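
    The two ideas in this abstract can be illustrated with a minimal Python sketch. It is not the 3of5 implementation and does not use 3of5 pattern syntax. It assumes a simplified reading of "n-of-m" (at least n residues from an allowed set within a window of m residues), and the function names n_of_m_hits and all_matches are hypothetical. The second function shows what reporting every solution of a length-ambiguous pattern means, as opposed to only the longest or shortest hit.

```python
import re


def n_of_m_hits(sequence, allowed, n, m):
    """Return (start, window) for each length-m window that contains
    at least n residues from `allowed` (simplified n-of-m assumption)."""
    allowed = set(allowed)
    hits = []
    for start in range(len(sequence) - m + 1):
        window = sequence[start:start + m]
        if sum(residue in allowed for residue in window) >= n:
            hits.append((start, window))
    return hits


def all_matches(pattern, sequence):
    """Return (start, match) for every substring that the pattern matches
    in full, so length-ambiguous patterns yield all of their solutions."""
    compiled = re.compile(pattern)
    return [
        (start, sequence[start:end])
        for start in range(len(sequence))
        for end in range(start + 1, len(sequence) + 1)
        if compiled.fullmatch(sequence, start, end)
    ]


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    print(n_of_m_hits("AKRAKAAAA", "KR", n=3, m=5))
    print(all_matches("KA{1,3}", "KAAAT"))
```

    A standard regex search would return a single hit for KA{1,3} at each start position; the exhaustive scan trades speed (quadratic in sequence length) for completeness.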

    SMART amplification combined with cDNA size fractionation in order to obtain large full-length clones

    BACKGROUND: cDNA libraries are widely used to identify genes and splice variants, and as a physical resource for full-length clones. Conventionally-generated cDNA libraries contain a high percentage of 5'-truncated clones. Current library construction methods that enrich for full-length mRNA are laborious, and involve several enzymatic steps performed on mRNA, which renders them sensitive to RNA degradation. The SMART technique for full-length enrichment is robust but results in limited cDNA insert size of the library. RESULTS: We describe a method to construct SMART full-length enriched cDNA libraries with large insert sizes. Sub-libraries were generated from size-fractionated cDNA with an average insert size of up to seven kb. The percentage of full-length clones was calculated for different size ranges from BLAST results of over 12,000 5'ESTs. CONCLUSIONS: The presented technique is suitable to generate full-length enriched cDNA libraries with large average insert sizes in a straightforward and robust way. The representation of full-coding clones is high also for large cDNAs (70%, 4–10 kb), when high-quality starting mRNA is used

    Automated production of recombinant human proteins as resource for proteome research

    BACKGROUND: An arbitrary set of 96 human proteins was selected and tested to set-up a fully automated protein production strategy, covering all steps from DNA preparation to protein purification and analysis. The target proteins are encoded by functionally uncharacterized open reading frames (ORF) identified by the German cDNA consortium. Fusion proteins were produced in E. coli with four different fusion tags and tested in five different purification strategies depending on the respective fusion tag. The automated strategy relies on standard liquid handling and clone picking equipment. RESULTS: A robust automated strategy for the production of recombinant human proteins in E. coli was established based on a set of four different protein expression vectors resulting in NusA/His, MBP/His, GST and His-tagged proteins. The yield of soluble fusion protein was correlated with the induction temperature and the respective fusion tag. NusA/His and MBP/His fusion proteins are best expressed at low temperature (25°C), whereas the yield of soluble GST fusion proteins was higher when protein expression was induced at elevated temperature. In contrast, the induction of soluble His-tagged fusion proteins was independent of the temperature. Amylose was not found useful for affinity-purification of MBP/His fusion proteins in a high-throughput setting, and metal chelating chromatography is recommended instead. CONCLUSION: Soluble fusion proteins can be produced in E. coli in sufficient qualities and μg/ml culture quantities for downstream applications like microarray-based assays, and studies on protein-protein interactions employing a fully automated protein expression and purification strategy. Future applications might include the optimization of experimental conditions for the large-scale production of soluble recombinant proteins from libraries of open reading frames.

    Extending pathways based on gene lists using InterPro domain signatures

    BACKGROUND: High-throughput technologies like functional screens and gene expression analysis produce extended lists of candidate genes. Gene-Set Enrichment Analysis is a commonly used and well established technique to test for the statistically significant over-representation of particular pathways. A shortcoming of this method is however, that most genes that are investigated in the experiments have very sparse functional or pathway annotation and therefore cannot be the target of such an analysis. The approach presented here aims to assign lists of genes with limited annotation to previously described functional gene collections or pathways. This works by comparing InterPro domain signatures of the candidate gene lists with domain signatures of gene sets derived from known classifications, e.g. KEGG pathways. RESULTS: In order to validate our approach, we designed a simulation study. Based on all pathways available in the KEGG database, we create test gene lists by randomly selecting pathway genes, removing these genes from the known pathways and adding variable amounts of noise in the form of genes not annotated to the pathway. We show that we can recover pathway memberships based on the simulated gene lists with high accuracy. We further demonstrate the applicability of our approach on a biological example. CONCLUSION: Results based on simulation and data analysis show that domain based pathway enrichment analysis is a very sensitive method to test for enrichment of pathways in sparsely annotated lists of genes. An R based software package domainsignatures, to routinely perform this analysis on the results of high-throughput screening, is available via Bioconductor.
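
    The comparison of domain signatures described in this abstract can be sketched roughly as follows. This is not the domainsignatures package and not necessarily the similarity measure the authors use: it assumes a signature is the fraction of genes in a list that carry each domain, and it assumes cosine similarity as the comparison. The function names and the toy gene and domain labels are hypothetical.

```python
from collections import Counter
from math import sqrt


def domain_signature(genes, gene_to_domains):
    """Fraction of genes in `genes` that carry each domain.
    Genes without any annotated domain contribute nothing."""
    counts = Counter()
    for gene in genes:
        counts.update(set(gene_to_domains.get(gene, ())))
    total = len(genes)
    return {domain: count / total for domain, count in counts.items()} if total else {}


def cosine_similarity(sig_a, sig_b):
    """Cosine similarity of two signatures (assumed measure, see lead-in)."""
    dot = sum(sig_a[d] * sig_b[d] for d in sig_a.keys() & sig_b.keys())
    norm = sqrt(sum(v * v for v in sig_a.values())) * sqrt(sum(v * v for v in sig_b.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Toy annotation with made-up gene and domain labels.
    gene_to_domains = {"g1": ["domA", "domB"], "g2": ["domA"], "g3": ["domC"]}
    candidate = domain_signature(["g1", "g2"], gene_to_domains)
    pathway = domain_signature(["g2", "g3"], gene_to_domains)
    print(cosine_similarity(candidate, pathway))
```

    In practice a similarity score like this would still need a significance estimate, for example by comparing against signatures of randomly drawn gene lists.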

    Statistical methods and software for the analysis of highthroughput reverse genetic assays using flow cytometry readouts

    Highthroughput cell-based assays with flow cytometric readout provide a powerful technique for identifying components of biologic pathways and their interactors. Interpretation of these large datasets requires effective computational methods. We present a new approach that includes data pre-processing, visualization, quality assessment, and statistical inference. The software is freely available in the Bioconductor package prada. The method permits analysis of large screens to detect the effects of molecular interventions in cellular systems

    Systematic analysis of T7 RNA polymerase based in vitro linear RNA amplification for use in microarray experiments

    BACKGROUND: The requirement of a large amount of high-quality RNA is a major limiting factor for microarray experiments using biopsies. An average microarray experiment requires 10–100 μg of RNA. However, due to their small size, most biopsies do not yield this amount. Several different approaches for RNA amplification in vitro have been described and applied for microarray studies. In most of these, systematic analyses of the potential bias introduced by the enzymatic modifications are lacking. RESULTS: We examined the sources of error introduced by the T7 RNA polymerase based RNA amplification method through hybridisation studies on microarrays and performed statistical analysis of the parameters that need to be evaluated prior to routine laboratory use. The results demonstrate that amplification of the RNA has no systematic influence on the outcome of the microarray experiment. Although variations in differential expression between amplified and total RNA hybridisations can be observed, RNA amplification is reproducible, and there is no evidence that it introduces a large systematic bias. CONCLUSIONS: Our results underline the utility of the T7 based RNA amplification for use in microarray experiments provided that all samples under study are equally treated

    The systematic functional characterisation of Xq28 genes prioritises candidate disease genes

    BACKGROUND: Well known for its gene density and the large number of mapped diseases, the human sub-chromosomal region Xq28 has long been a focus of genome research. Over 40 of approximately 300 X-linked diseases map to this region, and systematic mapping, transcript identification, and mutation analysis has led to the identification of causative genes for 26 of these diseases, leaving another 17 diseases mapped to Xq28, where the causative gene is still unknown. To expedite disease gene identification, we have initiated the functional characterisation of all known Xq28 genes. RESULTS: By using a systematic approach, we describe the Xq28 genes by RNA in situ hybridisation and Northern blotting of the mouse orthologs, as well as subcellular localisation and data mining of the human genes. We have developed a relational web-accessible database with comprehensive query options integrating all experimental data. Using this database, we matched gene expression patterns with affected tissues for 16 of the 17 remaining Xq28 linked diseases, where the causative gene is unknown. CONCLUSION: By using this systematic approach, we have prioritised genes in linkage regions of Xq28-mapped diseases to an amenable number for mutational screens. Our database can be queried by any researcher performing highly specified searches including diseases not listed in OMIM or diseases that might be linked to Xq28 in the future

    Differential expression of apoptotic genes PDIA3 and MAP3K5 distinguishes between low- and high-risk prostate cancer

    BACKGROUND: Despite recent progress in the identification of genetic and molecular alterations in prostate cancer, markers associated with tumor progression are scarce. Therefore precise diagnosis of patients and prognosis of the disease remain difficult. This study investigated novel molecular markers discriminating between low and highly aggressive types of prostate cancer. RESULTS: Using 52 microdissected cell populations of low- and high-risk prostate tumors, we identified via global cDNA microarrays analysis almost 1200 genes being differentially expressed among these groups. These genes were analyzed by statistical, pathway and gene enrichment methods. Twenty selected candidate genes were verified by quantitative real time PCR and immunohistochemistry. In concordance with the mRNA levels, two genes MAP3K5 and PDIA3 exposed differential protein expression. Functional characterization of PDIA3 revealed a pro-apoptotic role of this gene in PC3 prostate cancer cells. CONCLUSIONS: Our analyses provide deeper insights into the molecular changes occurring during prostate cancer progression. The genes MAP3K5 and PDIA3 are associated with malignant stages of prostate cancer and therefore provide novel potential biomarkers.